Hyper-Systolic Parallel Computing
Authors
Abstract
A new class of parallel algorithms is introduced that achieves a complexity of O(n^(3/2)) in interprocessor communication for the exact computation of systems with pairwise mutual interactions of all elements. Conventional methods have hitherto exhibited a communication complexity of O(n^2). The number of computational operations is unchanged in the new algorithm, which can be formulated as a kind of h-range problem known from additive number theory. We demonstrate the reduction in communication cost by comparing the standard systolic algorithm with the new algorithm on the Connection Machine CM5 and the CRAY T3D. The parallel method can be useful in various scientific and engineering fields, such as exact n-body dynamics with long-range forces, polymer chains, protein folding, and signal processing.
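The gap between the two communication costs in the abstract can be illustrated with a small counting sketch. It assumes n elements on a ring of n processors and counts only whole-array shifts. The offset set used here (0, 1, ..., K-1 together with K, 2K, ..., K*K, where K = ceil(sqrt(n))) is an illustrative choice, not a construction taken from the paper. The function names are hypothetical, and the sketch ignores the extra shifts needed to move partial results back to their home processors.

```python
import math


def systolic_shifts(n):
    """Standard systolic ring: the moving array is shifted n - 1 times by one
    position so that every element meets every other element once."""
    return n - 1


def illustrative_offsets(n):
    """Ring offsets at which shifted copies of the array are stored.
    Assumed layout (not from the paper): 0..K-1 and K, 2K, ..., K*K."""
    k = math.isqrt(n - 1) + 1  # equals ceil(sqrt(n)) for n >= 1
    small = list(range(k))
    large = [k * j for j in range(1, k + 1)]
    return sorted(set(small + large))


def covered_distances(offsets, n):
    """Ring distances (mod n) that occur as a difference of two stored copies.
    A distance d is covered when some pair of copies is exactly d apart."""
    return {(a - b) % n for a in offsets for b in offsets}


def hyper_systolic_shifts(n):
    """One shift per stored copy beyond the unshifted original."""
    return len(illustrative_offsets(n)) - 1


if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        offsets = illustrative_offsets(n)
        complete = covered_distances(offsets, n) == set(range(n))
        print(n, systolic_shifts(n), hyper_systolic_shifts(n), complete)
```

Each shift moves all n elements, so about n shifts give the O(n^2) total and about 2*sqrt(n) shifts give the O(n^(3/2)) total. Choosing a small set of offsets whose pairwise differences cover every distance is the kind of covering question that the h-range formulation addresses.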
Similar resources
Hyper-systolic algorithms for N-body computations and parallel level-3 BLAS libraries
Hyper-systolic algorithms represent a new class of parallel computing structures. Because of their regular communication and compute patterns, they are well suited for implementation on most parallel architectures; high-performance SIMD machines in particular can benefit considerably. After a short explanation of the concept of hyper-systolic algorithms, their application to N-body computations a...
Generalized Hyper-Systolic Algorithm
We generalize the hyper-systolic algorithm proposed in [1] for abstract data structures on massively parallel computers with n_p processors. For a problem of size V, the communication complexity of the hyper-systolic algorithm is proportional to √(n_p)·V, compared with n_p·V for the systolic case. The implementation technique is explained in detail and the example of the parallel matrix-matrix mu...
Hyper-Systolic Matrix Multiplication
A novel parallel algorithm for matrix multiplication is presented. The hyper-systolic algorithm makes use of a one-dimensional processor abstraction. The procedure can be implemented on all types of parallel systems. It can handle matrix-vector multiplications as well as transposed matrix products.
Hyper-Systolic Implementation of BLAS-3 Routines on the APE100/Quadrics
Basic Linear Algebra Subroutines (BLAS-3) [1] are building blocks for solving many numerical problems (Cholesky factorization, Gram-Schmidt orthonormalization, LU decomposition,...). Their efficient implementation on a given parallel machine is a key issue for the maximal exploitation of the system's computational power. In this work we refer to a massively parallel processing SIMD machine (the AP...
Hyper-Systolic Implementation of BLAS-3 Routines in the APE100/Quadrics Machine
Basic Linear Algebra Subroutines (BLAS-3) [Cho 92] are building blocks for solving many numerical problems (Cholesky factorization, Gram-Schmidt orthonormalization, LU decomposition,...). Their efficient implementation on a given parallel machine is a key issue for the maximal exploitation of the system's computational power. In this work we refer to a massively parallel processing SIMD machin...
Hyper-Systolic Implementation of BLAS-3 Routines on the APE100/Quadrics Machine
Basic Linear Algebra Subroutines (BLAS-3) [Cho 92] are building blocks for solving many numerical problems (Cholesky factorization, Gram-Schmidt orthonormalization, LU decomposition,...). Their efficient implementation on a given parallel machine is a key issue for the maximal exploitation of the system's computational power. In this work we refer to a massively parallel processing SIMD machin...
Journal title:
- IEEE Trans. Parallel Distrib. Syst.
Volume 9
Pages: -
Publication date: 1998